A hierarchical fuzzy cluster ensemble approach and its application to big data clustering
نویسندگان
چکیده
Cluster ensembles organically integrate individual component methods which may utilise different parameter settings and features, and which may themselves be generated on the basis of different representations and learning mechanisms. Such a technique offers an effective means for aggregating multiple clustering results in order to improve the overall clustering accuracy and robustness. Many topics regarding cluster ensembles have been proposed and promising results are gained in the literature. To reinforce such development, this paper presents another cluster ensemble approach for fuzzy clustering, with an aim to be applied for clustering of big data. The proposed algorithm first generates fuzzy base clusters with respect to each data feature and then, employs a fuzzy hierarchical graph to represent the relationships between the resulting base clusters. Whilst the work employs fuzzy c-means and hierarchical clustering in generating base cluster and implementing consensus function respectively, when applied to large datasets it has lower time complexity than the original fuzzy c-means and hierarchical clustering. The resultant ensemble clustering mechanism is tested against traditional clustering methods on various benchmark datasets. Experimental results demonstrate that it generally outperforms crisp cluster ensembles and single linkage agglomerative clustering, in terms of accuracy in conjunction with time efficiency, thereby showing that it has the potential for application in clustering big data.
منابع مشابه
High-Dimensional Unsupervised Active Learning Method
In this work, a hierarchical ensemble of projected clustering algorithm for high-dimensional data is proposed. The basic concept of the algorithm is based on the active learning method (ALM) which is a fuzzy learning scheme, inspired by some behavioral features of human brain functionality. High-dimensional unsupervised active learning method (HUALM) is a clustering algorithm which blurs the da...
متن کاملA new ensemble clustering method based on fuzzy cmeans clustering while maintaining diversity in ensemble
An ensemble clustering has been considered as one of the research approaches in data mining, pattern recognition, machine learning and artificial intelligence over the last decade. In clustering, the combination first produces several bases clustering, and then, for their aggregation, a function is used to create a final cluster that is as similar as possible to all the cluster bundles. The inp...
متن کاملEntropy-based Consensus for Distributed Data Clustering
The increasingly larger scale of available data and the more restrictive concerns on their privacy are some of the challenging aspects of data mining today. In this paper, Entropy-based Consensus on Cluster Centers (EC3) is introduced for clustering in distributed systems with a consideration for confidentiality of data; i.e. it is the negotiations among local cluster centers that are used in t...
متن کاملA Fuzzy C-means Algorithm for Clustering Fuzzy Data and Its Application in Clustering Incomplete Data
The fuzzy c-means clustering algorithm is a useful tool for clustering; but it is convenient only for crisp complete data. In this article, an enhancement of the algorithm is proposed which is suitable for clustering trapezoidal fuzzy data. A linear ranking function is used to define a distance for trapezoidal fuzzy data. Then, as an application, a method based on the proposed algorithm is pres...
متن کاملGraph Clustering by Hierarchical Singular Value Decomposition with Selectable Range for Number of Clusters Members
Graphs have so many applications in real world problems. When we deal with huge volume of data, analyzing data is difficult or sometimes impossible. In big data problems, clustering data is a useful tool for data analysis. Singular value decomposition(SVD) is one of the best algorithms for clustering graph but we do not have any choice to select the number of clusters and the number of members ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Journal of Intelligent and Fuzzy Systems
دوره 28 شماره
صفحات -
تاریخ انتشار 2015